12 research outputs found

    Android malware detection using machine learning to mitigate adversarial evasion attacks

    Get PDF
    In the current digital era, smartphones have become indispensable. Over the past few years, the exponential growth of Android users has made this operating system (OS) a prime target for smartphone malware. Consequently, the arms race between Android security personnel and malware developers seems enduring. Considering Machine Learning (ML) as the core component, various techniques are proposed in the literature to counter Android malware, however, the problem of adversarial evasion attacks on ML-based malware classifiers is understated. MLbased techniques are vulnerable to adversarial evasion attacks. The malware authors constantly try to craft adversarial examples to elude existing malware detection systems. This research presents the fragility of ML-based Android malware classifiers in adversarial environments and proposes novel techniques to counter adversarial evasion attacks on ML based Android malware classifiers. First, we start our analysis by introducing the problem of Android malware detection in adversarial environments and provide a comprehensive overview of the domain. Second, we highlight the problem of malware clones in popular Android malware repositories. The malware clones in the datasets can potentially lead to biased results and computational overhead. Although many strategies are proposed in the literature to detect repackaged Android malware, these techniques require burdensome code inspection. Consequently, we employ a lightweight and novel strategy based on package names reusing to identify repackaged Android malware and build a clones-free Android malware dataset. Furthermore, we investigate the impact of repacked Android malware on various ML-based classifiers by training them on a clones free training set and testing on a set of benign, non repacked malware and all the malware clones in the dataset. Although trained on a reduced train set, we achieved up to 98.7% F1 score. Third, we propose Cure-Droid, an Android malware classification model trained on hybrid features and optimized using a tree-based pipeline optimization technique (TPoT). Fourth, to present the fragility of Cure- Droid model in adversarial environments, we formulate multiple adversarial evasion attacks to elude the model. Fifth, to counter adversarial evasion attacks on ML-based Android malware detectors, we propose CureDroid*, a novel and adversarially aware Android malware classification model. CureDroid* is based on an ensemble of ML-based models trained on distinct set of features where each model has the individual capability to detect Android malware. The CureDroid* model employs an ensemble of five ML-based models where each model is selected and optimized using TPoT. Our experimental results demonstrate that CureDroid* achieves up to 99.2% accuracy in non-adversarial settings and can detect up to 30 fabricated input features in the best case. Finally, we propose TrickDroid, a novel cumulative adversarial training framework based on Oracle and GAN-based adversarial data. Our experimental results present the efficacy of TrickDroid with up to 99.46% evasion detection

    Deep reinforcement learning based Evasion Generative Adversarial Network for botnet detection

    Get PDF
    Botnet detectors based on machine learning are potential targets for adversarial evasion attacks. Several research works employ adversarial training with samples generated from generative adversarial nets (GANs) to make the botnet detectors adept at recognising adversarial evasions. However, the synthetic evasions may not follow the original semantics of the input samples. This paper proposes a novel GAN model leveraged with deep reinforcement learning (DRL) to explore semantic aware samples and simultaneously harden its detection. A DRL agent is used to attack the discriminator of the GAN that acts as a botnet detector. The agent trains the discriminator on the crafted perturbations during the GAN training, which helps the GAN generator converge earlier than the case without DRL. We name this model RELEVAGAN, i.e. [“relieve a GAN” or deep REinforcement Learning-based Evasion Generative Adversarial Network] because, with the help of DRL, it minimises the GAN's job by letting its generator explore the evasion samples within the semantic limits. During the GAN training, the attacks are conducted to adjust the discriminator weights for learning crafted perturbations by the agent. RELEVAGAN does not require adversarial training for the ML classifiers since it can act as an adversarial semantic-aware botnet detection model. The code will be available at https://github.com/rhr407/RELEVAGAN

    Android Malware Classification Using Machine Learning and Bio-Inspired Optimisation Algorithms

    Get PDF
    In recent years the number and sophistication of Android malware have increased dramatically. A prototype framework which uses static analysis methods for classification is proposed which employs two feature sets to classify Android malware, permissions declared in the AndroidManifest.xml and Android classes used from the Classes.dex file. The extracted features were then used to train a variety of machine learning algorithms including Random Forest, SGD, SVM and Neural networks. Each machine learning algorithm was subsequently optimised using optimisation algorithms, including the use of bio-inspired optimisation algorithms such as Particle Swarm Optimisation, Artificial Bee Colony optimisation (ABC), Firefly optimisation and Genetic algorithm. The prototype framework was tested and evaluated using three datasets. It achieved a good accuracy of 95.7 percent by using SVM and ABC optimisation for the CICAndMal2019 dataset, 94.9 percent accuracy (with f1- score of 96.7 percent) using Neural network for the KuafuDet dataset and 99.6 percent accuracy using an SGD classifier for the Andro-Dump dataset. The accuracy could be further improved through better feature selection

    An Investigation on Fragility of Machine Learning Classifiers in Android Malware Detection

    Get PDF
    Machine learning (ML) classifiers have been increasingly used in Android malware detection and countermeasures for the past decade. However, ML based solutions are vulnerable to adversarial evasion attacks. An attacker can craft a malicious sample carefully to fool an underlying pre-trained classifier. In this paper, we highlight the fragility of the ML classifiers against adversarial evasion attacks. We perform mimicry attacks based on Oracle and Generative Adversarial Network (GAN) against these classifiers using our proposed methodology. We use static analysis on Android applications to extract API-based features from a balanced excerpt of a well-known public dataset. The empirical results demonstrate that among ML classifiers, the detection capability of linear classifiers can be reduced as low as 0 by perturbing only up to 4 out of 315 extracted API features. As a countermeasure, we propose TrickDroid, a cumulative adversarial training scheme based on Oracle and GAN-based adversarial data to improve evasion detection. The experimental results of cumulative adversarial training achieves a remarkable detection accuracy of up to 99.46 against adversarial samples

    Security Hardening of Botnet Detectors Using Generative Adversarial Networks

    Get PDF
    Machine learning (ML) based botnet detectors are no exception to traditional ML models when it comes to adversarial evasion attacks. The datasets used to train these models have also scarcity and imbalance issues. We propose a new technique named Botshot , based on generative adversarial networks (GANs) for addressing these issues and proactively making botnet detectors aware of adversarial evasions. Botshot is cost-effective as compared to the network emulation for botnet traffic data generation rendering the dedicated hardware resources unnecessary. First, we use the extended set of network flow and time-based features for three publicly available botnet datasets. Second, we utilize two GANs (vanilla, conditional) for generating realistic botnet traffic. We evaluate the generator performance using classifier two-sample test (C2ST) with 10-fold 70-30 train-test split and propose the use of ’recall’ in contrast to ’accuracy’ for proactively learning adversarial evasions. We then augment the train set with the generated data and test using the unchanged test set. Last, we compare our results with benchmark oversampling methods with augmentation of additional botnet traffic data in terms of average accuracy, precision, recall and F1 score over six different ML classifiers. The empirical results demonstrate the effectiveness of the GAN-based oversampling for learning in advance the adversarial evasion attacks on botnet detectors

    Evasion Generative Adversarial Network for Low Data Regimes

    Get PDF
    A myriad of recent literary works has leveraged generative adversarial networks (GANs) to generate unseen evasion samples. The purpose is to annex the generated data with the original train set for adversarial training to improve the detection performance of machine learning (ML) classifiers. The quality of generated adversarial samples relies on the adequacy of training data samples. However, in low data regimes like medical diagnostic imaging and cybersecurity, the anomaly samples are scarce in number. This paper proposes a novel GAN design called Evasion Generative Adversarial Network (EVAGAN) that is more suitable for low data regime problems that use oversampling for detection improvement of ML classifiers. EVAGAN not only can generate evasion samples, but its discriminator can act as an evasion-aware classifier. We have considered Auxiliary Classifier GAN (ACGAN) as a benchmark to evaluate the performance of EVAGAN on cybersecurity (ISCX-2014, CIC-2017 and CIC2018) botnet and computer vision (MNIST) datasets. We demonstrate that EVAGAN outperforms ACGAN for unbalanced datasets with respect to detection performance, training stability and time complexity. EVAGAN’s generator quickly learns to generate the low sample class and hardens its discriminator simultaneously. In contrast to ML classifiers that require security hardening after being adversarially trained by GAN-generated data, EVAGAN renders it needless. The experimental analysis proves that EVAGAN is an efficient evasion hardened model for low data regimes for the selected cybersecurity and computer vision datasets. Code will be available at HTTPS://www.github.com/rhr407/EVAGAN

    ڈپٹی نذیر احمد کے اسم بامسمیٰ کردار

    No full text
    Molvi Nazeer Ahmed is one of the greatest Urdu Novelists. He was considered as a first Urdu Novelist. His novels bear mostly the themes of troubles, ignorance and unawareness prevailing in his era. He was anxious about education of Muslim women and their lot in general; their ignorance and other problems. He gave the idea of a prefect woman being a practical and well-educated and presented them as guides for juvenile girls. His unmatchable novel, Mirat-ul-Uroos (The bride's mirror) comprises such subjects that endorse the foundation of female literacy in Muslim and Indian society, and is recognized for giving birth to an entire genre of fictional works encouraging female education in Urdu. It was his art of characterization that placed his novels in front line of Urdu Novels. Most of his characters are charactonym that means giving the name of a fictional character in such a way that the given name itself reveals the personality of a character. The present study critically analyzes his adaption of technique of charactonym in his novels. Being a social and religious reformer, he used this technique to get the purpose of reformation. By presenting the personal and internal life of a character, he depicts the true picture of human nature

    Mitigating Malicious Adversaries Evasion Attacks in Industrial Internet of Things

    Get PDF
    With advanced 5 G/6 G networks, data-driven interconnected devices will increase exponentially. As a result, the Industrial Internet of Things (IIoT) requires data secure information extraction to apply digital services, medical diagnoses and financial forecasting. This introduction of high-speed network mobile applications will also adapt. As a consequence, the scale and complexity of Android malware are rising. Detection of malware classification vulnerable to attacks. A fabricate feature can force misclassification to produce the desired output. This study proposes a subset feature selection method to evade fabricated attacks in the IIOT environment. The method extracts application-aware features from a single android application to train an independent classification model. Ensemble-based learning is then used to train the distinct classification models. Finally, the collaborative ML classifier makes independent decisions to fight against adversarial evasion attacks. We compare and evaluate the benchmark Android malware dataset. The proposed method achieved 91% accuracy with 14 fabricated input features

    Evasion Generative Adversarial Network for Low Data Regimes

    Get PDF
    A myriad of recent literary works has leveraged generative adversarial networks (GANs) to generate unseen evasion samples. The purpose is to annex the generated data with the original train set for adversarial training to improve the detection performance of machine learning (ML) classifiers. The quality of generated adversarial samples relies on the adequacy of training data samples. However, in low data regimes like medical diagnostic imaging and cybersecurity, the anomaly samples are scarce in number. This paper proposes a novel GAN design called Evasion Generative Adversarial Network (EVAGAN) that is more suitable for low data regime problems that use oversampling for detection improvement of ML classifiers. EVAGAN not only can generate evasion samples, but its discriminator can act as an evasion-aware classifier. We have considered Auxiliary Classifier GAN (ACGAN) as a benchmark to evaluate the performance of EVAGAN on cybersecurity (ISCX-2014, CIC-2017 and CIC2018) botnet and computer vision (MNIST) datasets. We demonstrate that EVAGAN outperforms ACGAN for unbalanced datasets with respect to detection performance, training stability and time complexity. EVAGAN’s generator quickly learns to generate the low sample class and hardens its discriminator simultaneously. In contrast to ML classifiers that require security hardening after being adversarially trained by GAN-generated data, EVAGAN renders it needless. The experimental analysis proves that EVAGAN is an efficient evasion hardened model for low data regimes for the selected cybersecurity and computer vision datasets. Code will be available at HTTPS://www.github.com/rhr407/EVAGAN
    corecore